# Video Action Recognition
Videomae Base Finetuned Kinetics 0409 Final 5sec Org Ab7 Val Inside Train
A fine-tuned version of MCG-NJU/videomae-base-finetuned-kinetics for video understanding tasks, reaching 91.38% accuracy on the evaluation set.
Video Processing
Transformers

d2o2ji
17
0
Videomae Base Finetuned Kinetics 0409 Final 5sec Org Ab7 Val As123 Retry
A video understanding model fine-tuned from MCG-NJU/videomae-base-finetuned-kinetics, reaching 91.23% accuracy on the evaluation set
Video Processing
Transformers

d2o2ji
30
0
Videomae Base Finetuned Ucf101 Subset
A video classification model based on the VideoMAE base model, fine-tuned on a subset of UCF101
Video Processing
Transformers

cccchristopher
30
0
Videomae Base Finetuned Kinetics 0408 Final 5sec Org Ab7 Val As123
A video action recognition model based on the VideoMAE architecture, fine-tuned on the Kinetics dataset with an accuracy of 92.25%
Video Processing
Transformers

d2o2ji
31
0
Videomae Base Finetuned Kinetics 0408 Final 45sec Org
A video understanding model fine-tuned from MCG-NJU/videomae-base-finetuned-kinetics, reaching 90.97% accuracy on the evaluation set
Video Processing
Transformers

d2o2ji
26
0
Videomae Base Finetuned Ucf101 Subset
A video understanding model based on the VideoMAE base model, fine-tuned on a subset of the UCF101 action recognition dataset
Video Processing
Transformers

ihsanahakiim
39
0
Timesformer Hr Finetuned K600
TimeSformer-HR is a video action recognition model optimized for high-resolution videos and fine-tuned on the Kinetics-600 dataset.
Video Processing
Transformers

onnx-community
17
0
Timesformer Hr Finetuned K400
TimeSformer-HR is a high-resolution spatiotemporal Transformer model for video, fine-tuned on the Kinetics-400 dataset, suitable for video action recognition tasks.
Video Processing
Transformers

onnx-community
17
0
Timesformer Base Finetuned Ssv2
TimeSformer is a Transformer-based video understanding model; this checkpoint is fine-tuned on the Something-Something V2 (SSv2) dataset for temporal action recognition tasks.
Video Processing
Transformers

onnx-community
17
0
Timesformer Base Finetuned K600
TimeSformer is a video understanding model based on the Transformer architecture; this checkpoint is fine-tuned on the Kinetics-600 dataset for video classification tasks.
Video Processing
Transformers

onnx-community
16
0
Timesformer Base Finetuned K400
TimeSformer is a Transformer-based video understanding model, specifically fine-tuned on the Kinetics-400 dataset.
Video Processing
Transformers

onnx-community
17
0
Vivit B 16x2 Kinetics400 Finetuned Cctv Surveillance
MIT
A video action recognition model based on the ViViT architecture, fine-tuned specifically for CCTV surveillance scenarios, excelling in action recognition tasks.
Video Processing
Transformers

ratchy-oak
1,939
1
Videomae Base Finetuned Kinetics Finetuned Dcsass Shoplifting Subset
A video classification model based on the VideoMAE architecture, fine-tuned specifically for shoplifting behavior detection
Video Processing
Transformers

Abdullah1
23
0
Videomae Base Finetuned Kinetics Finetuned Fall Detect
A video action recognition model based on the VideoMAE architecture, specifically fine-tuned for fall detection tasks
Video Processing
Transformers

yadvender12
105
1
Athit Timesformer 32PS
TimeSformer is a video understanding model based on a spatio-temporal attention mechanism, fine-tuned on the Kinetics-400 dataset and suitable for video classification tasks.
Video Processing
Transformers

mbushee
17
0
Timesformer Base Finetuned K400 Finetuned Asl
A video classification model fine-tuned from facebook/timesformer-base-finetuned-k400, reaching 96.25% accuracy on the evaluation set.
Video Processing
Transformers

Krithiik
74
0
Timesformer Base Finetuned K400 Continual Lora Ucf101 Continual Lora Ucf101
A video action recognition model based on the TimeSformer architecture, pre-trained on the Kinetics-400 dataset and fine-tuned on the UCF101 dataset
Video Processing
Transformers

NiiCole
18
0
Timesformer Base Finetuned K400 Continual Lora Ucf101
A video classification model based on the TimeSformer architecture, pre-trained on the Kinetics-400 dataset and fine-tuned on the UCF101 dataset, using LoRA for continual learning.
Video Processing
Transformers

NiiCole
17
0
Timesformer Base Finetuned K400 Finetuned Olimpics Sport Subset
A video action recognition model based on the TimeSformer architecture, pre-trained on the Kinetics-400 dataset and fine-tuned on an Olympic sports subset
Video Processing
Transformers

IsraelSonseca
25
0
Videomae Small Finetuned Ssv2
VideoMAE is a self-supervised pretrained video model based on Masked Autoencoder (MAE), fine-tuned on the Something-Something V2 dataset for video classification tasks.
Video Processing
Transformers

MCG-NJU
140
0
Videomae Base Finetuned Ucf101 Subset
A video classification model based on the VideoMAE base model, fine-tuned on a subset of UCF101
Video Processing
Transformers

koya1
14
0
Videomae Base Finetuned Ucf101 Subset
A video understanding model based on the VideoMAE base model, fine-tuned on a subset of UCF101 and reaching 95.71% accuracy
Video Processing
Transformers

anitavero
14
0
Videomae Base Finetuned Ucf101 Subset
A video classification model based on the VideoMAE base model, fine-tuned on a subset of UCF101 and reaching 95.22% accuracy
Video Processing
Transformers

burcusu
17
2
Videomae Base Short Finetuned Ssv2 Finetuned Rwf2000 Epochs8 Batch8 Fp16
A video action recognition model based on the VideoMAE architecture, pre-trained on the SSv2 dataset and further fine-tuned on the RWF-2000 dataset
Video Processing
Transformers

lmazzon70
14
0
Videomae Base Ssv2 Finetuned Rwf2000
A video understanding model based on the VideoMAE architecture, fine-tuned on the RWF-2000 dataset for violence detection tasks
Video Processing
Transformers

lmazzon70
30
0
Timesformer Large Finetuned K400
TimeSformer is a video classification model based on a spatio-temporal attention mechanism, designed for video understanding tasks.
Video Processing
Transformers

fcakyon
254
0
Timesformer Base Finetuned K400
TimeSformer is a video classification model based on a spatio-temporal attention mechanism, fine-tuned on the Kinetics-400 dataset.
Video Processing
Transformers

fcakyon
17
0
Timesformer Hr Finetuned K600
TimeSformer is a video understanding model based on spatiotemporal attention mechanisms, with its high-resolution variant specifically fine-tuned for the Kinetics-600 dataset.
Video Processing
Transformers

fcakyon
22
0
Videomae Base Finetuned Ucf101
MIT
A video action recognition model based on the VideoMAE Base model, fine-tuned on the UCF101 dataset
Video Processing
Transformers English

nateraw
130
1
Videomae Base Finetuned Ucf101 Subset
A video classification model based on the VideoMAE architecture, fine-tuned on a subset of UCF101 with an accuracy of 85.16%
Video Processing
Transformers

nateraw
77
0
Timesformer Hr Finetuned K600
TimeSformer is a video classification model based on spatio-temporal attention mechanisms, specifically designed for video understanding tasks.
Video Processing
Transformers

facebook
2,927
6